Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLB map setup per chip #179

Merged
merged 13 commits into from
Oct 23, 2024
Merged

TLB map setup per chip #179

merged 13 commits into from
Oct 23, 2024

Conversation

broskoTT
Copy link
Contributor

@broskoTT broskoTT commented Oct 17, 2024

Currently tt_SiliconDevice accepts one map for everything. But that goes against our effort to separate cluster vs chip responsibilities, related to #157
This change includes:

  • Adding chip_id to setup_core_to_tlb_map. There should be a map per mmio chip
  • Started a chip api tests file
  • Added an api example/test on how TLB setup functions at the moment. This also does some minor configuration testing, whether it throws or not.
  • Minor cosmetic changes to other API tests

Contributes to #154 since it adds more api tests.
This change will require tt_metal changes.

@joelsmithTT
Copy link
Contributor

I will give this PR a more thorough review, but my initial reaction is that the relationship between TLB index and (x,y) location is something UMD should hide.

UMD has to know about the TLB windows, there's no way around that. But the application does not. It shouldn't matter what TLB is mapped where as long as UMD/KMD manage these resources effectively.

@broskoTT broskoTT added the changes api API changing PR, needs changes in client code label Oct 18, 2024
@broskoTT
Copy link
Contributor Author

I will give this PR a more thorough review, but my initial reaction is that the relationship between TLB index and (x,y) location is something UMD should hide.

UMD has to know about the TLB windows, there's no way around that. But the application does not. It shouldn't matter what TLB is mapped where as long as UMD/KMD manage these resources effectively.

I agree, but bear in mind that this is not what this PR is about. This is not a final API, it is just pointing in the right direction, that TLB setup is chip specific. We might end up removing it altogether in the end

Copy link
Contributor

@joelsmithTT joelsmithTT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the intent behind the change, but it expands an existing pattern that I regard as a core UMD design flaw: the pervasive use of hash tables for per-chip information lookup. This pattern - which exists in UMD independent of this change - undermines performance and has led to workarounds (get_fast_pcie_static_tlb_write_callable and tt::Writer).

As an interim change along a pathway to a better API, I have no objection.

device/tt_device.h Outdated Show resolved Hide resolved
device/tt_device.h Show resolved Hide resolved
device/tt_silicon_driver.cpp Show resolved Hide resolved
@broskoTT
Copy link
Contributor Author

  • undermines performance and has led to workarounds (get_fast_pcie_static_tlb_write_callable and tt::Writer).

I completely agree. But it is fine for me to have a more flexible API for slower access, and a performant option (such as tt::Writer, or as you suggested TLBWindow, which I ended up suggesting be AbstractIO in the end).

On the other hand I do want to investigate maps that we use a bit. Theoretically, it shouldn't be expensive to use a hash map. It is known to have a bad implementation in stdlib though, there are much better options. I also suspect that just changing everything to map and set would improve perf (they should be more performant for very small containers, which are ours). Hash maps are not slow by design, and I'd very much like to get to the bottom of where the perf issue really is.

@broskoTT
Copy link
Contributor Author

Related tenstorrent/tt-metal#13949 got approved, so now merging this PR.

@broskoTT broskoTT merged commit b473c5d into main Oct 23, 2024
17 checks passed
@broskoTT broskoTT deleted the brosko/tlb_map_per_chip branch October 23, 2024 09:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changes api API changing PR, needs changes in client code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants